Abstract

The Restricted Isometry Property (RIP) is a fundamental property of a matrix enabling
sparse recovery [CRT06]. Informally, an $m \times n$ matrix satisfies RIP of order $k$ in the $\ell_p$ norm if
$\|Ax\|_p \approx \|x\|_p$ for any vector $x$ that is $k$-sparse, i.e., that has at most $k$ non-zeros. The minimal
number of rows $m$ necessary for the property to hold has been extensively investigated, and
tight bounds are known. Motivated by signal processing models, a recent work of Baraniuk et
al. [BCDH10] has generalized this notion to the case where the support of $x$ must belong to a
given model, i.e., a given family of supports. This more general notion is much less understood,
especially for norms other than $\ell_2$.

In this paper we present tight bounds for the model-based RIP property in the $\ell_1$ norm. Our
bounds hold for the two most frequently investigated models: tree-sparsity and block-sparsity.
We also show implications of our results to sparse recovery problems.

1 Introduction

In recent years, a new "linear" approach for obtaining a succinct approximate representation of
$n$-dimensional vectors (or signals) has been discovered. For any signal $x$, the representation is equal
to $Ax$, where $A$ is an $m \times n$ matrix, or possibly a random variable chosen from some distribution
over such matrices. The vector $Ax$ is often referred to as the measurement vector or linear sketch
of $x$. Although $m$ is typically much smaller than $n$, the sketch $Ax$ often contains plenty of useful
information about the signal $x$.

A particularly useful and well-studied problem is that of stable sparse recovery. We say that
a vector $x'$ is $k$-sparse if it has at most $k$ non-zero coordinates. The sparse recovery problem is
typically defined as follows: for some norm parameters $p$ and $q$ and an approximation factor $C > 0$,
given $Ax$, recover an "approximation" vector $x^*$ such that

$$\|x - x^*\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_q \tag{1}$$

(this inequality is often referred to as the $\ell_p/\ell_q$ guarantee). Sparse recovery has a tremendous number of
applications in areas such as compressive sensing of signals [CRT06, Don06], genetic data acquisition
and analysis, and data stream algorithms [Mut05, GI10].

It is known [CRT06] that there exist matrices $A$ and associated recovery algorithms that produce
approximations $x^*$ satisfying Equation (1) with $p = q = 1$,¹ constant approximation factor $C$, and
sketch length

$$m = O(k \log(n/k)) \tag{2}$$

¹ In fact, one can prove a somewhat stronger guarantee, referred to as the $\ell_2/\ell_1$ guarantee.



This result was proved by showing that there exist matrices $A$ with $m = O(k \log(n/k))$ rows that
satisfy the Restricted Isometry Property (RIP). Formally, we say that $A$ is a $(k, \varepsilon)$-RIP-$p$ matrix, if
for every $x \in \mathbb{R}^n$ with at most $k$ non-zero coordinates we have

$$(1 - \varepsilon)\|x\|_p \le \|Ax\|_p \le (1 + \varepsilon)\|x\|_p.$$

The proof of [CRT06] proceeds by showing that (i) there exist matrices with $m = O(k \log(n/k))$ rows
that satisfy $(k, \varepsilon)$-RIP-2 for some constant $\varepsilon > 0$ and (ii) for such matrices there exists a polynomial
time recovery algorithm that, given $Ax$, produces $x^*$ satisfying Equation (1). Similar results were
obtained for RIP-1 matrices [BGI+08]. The latter matrices are closely connected to hashing-based
streaming algorithms for heavy-hitter problems, see [GI10] for an overview.

It is known that the bound on the number of measurements in Equation (2) is asymptotically
optimal for some constant $C$ and $p = q = 1$, see [BIPW10] and [FPRU10] (building on [Don06,
GG84, Glu84, Kas77]). The necessity of the "extra" logarithmic factor multiplying $k$ is quite
unfortunate: the sketch length determines the "compression rate", and for large $n$ any logarithmic
factor can worsen that rate tenfold. Fortunately, a more careful modeling offers a way to overcome
the aforementioned limitation. In particular, after decades of research in signal modeling, signal
processing researchers know that not all supports (i.e., sets of non-zero coordinates) are equally
common. For example, if a signal is a function of time, large coefficients of the signal tend to occur
consecutively. This phenomenon can be exploited by searching for the best $k$-sparse approximation
$x^*$ whose support belongs to a given "model" family of supports $\mathcal{M}_k$ (i.e., $x^*$ is $\mathcal{M}_k$-sparse).
Formally, we seek $x^*$ such that

$$\|x - x^*\|_p \le C \cdot \min_{\substack{\operatorname{supp}(x') \subseteq T \\ T \in \mathcal{M}_k}} \|x - x'\|_q \tag{3}$$

for some family $\mathcal{M}_k$ of $k$-subsets of $[n]$. Clearly, the original $k$-sparse recovery problem
corresponds to the case when $\mathcal{M}_k$ is the family of all $k$-subsets of $[n]$.

A prototypical example of a sparsity model is block sparsity [EM09]. Here the signal is divided
into blocks of size $b$, and the non-zero coefficients belong to at most $k/b$ blocks. This model is
particularly useful for bursty time signals, where the "activity" occurs during a limited time period,
and is therefore contained in a few blocks. Another example is tree sparsity [RCB01], which models
the structure of wavelet coefficients. Here the non-zero coefficients form a rooted subtree in a full
binary tree defined over the coordinates.² For many such scenarios the size of the family $\mathcal{M}_k$ is
much smaller than $\binom{n}{k}$, which in principle makes it possible to recover an approximation from fewer
measurements.

An elegant and very general model-based sparse recovery scheme was recently provided in a
seminal work of Baraniuk et al. [BCDH10]. The scheme has the property that, for any
"computationally tractable" family of supports of "small" size, it guarantees a near-optimal sketch length
$m = O(k)$, i.e., without any logarithmic factors. This is achieved by showing the existence of
matrices $A$ satisfying the model-based variant of RIP. Formally, we say that $A$ satisfies $\varepsilon$-$\mathcal{M}_k$-RIP-$p$
if

$$(1 - \varepsilon)\|x\|_p \le \|Ax\|_p \le (1 + \varepsilon)\|x\|_p \tag{4}$$

for any $\mathcal{M}_k$-sparse vector $x \in \mathbb{R}^n$.



² See Section 2 for formal definitions of the two models.



In [BCDH10] it was shown that there exist matrices with $m = O(k)$ rows that satisfy
$\varepsilon$-$\mathcal{M}_k$-RIP-2 as long as (i) either $\mathcal{M}_k$ is the block-sparse model and $b = \Omega(\log n)$ or (ii) $\mathcal{M}_k$ is the
tree-sparse model. This property can then be used to give an efficient algorithm that, given $Ax$,
finds $x^*$ satisfying a variant of the guarantee of Equation (3). However, the guarantees offered
in [BCDH10], when phrased in the $\ell_1/\ell_1$ framework, result in a super-constant approximation
factor $C = O(\sqrt{\log n})$ [IP11]. The question of whether this bound can be improved has attracted
considerable attention in signal processing and streaming communities. In particular, one of the
problems³ listed in the Bertinoro workshop open problem list [B1111] asks whether there exist
matrices $A$ with $m = O(k)$ rows that provide the $\ell_1/\ell_1$ guarantee for the tree-sparse model with
some constant approximation factor $C$.

Our results In this paper we make substantial progress on this question. In particular:

1. For both block-sparse and tree-sparse models, we show that there exist $m \times n$ matrices $A$ that
provide the $\ell_1/\ell_1$ guarantee for some constant approximation factor $C$, such that the number
of measurements improves over the bound of Equation (2) for a wide range of parameters $k$ and
$b$. In particular we show that for the block-sparse model we can achieve $m = O(k \log_k n)$ as
long as $b = \omega(\log n)$ and $k \ge 2b$. This improves over the $O(k \log(n/k))$ bound of Equation (2)
for any $k$ in this range. In particular, if $k = n^{\Omega(1)}$, we obtain $m = O(k)$. For the tree-sparse
model we achieve $m = O(k \log(n/k) / \log\log(n/k))$ as long as $k = \omega(\log n)$. This also improves
over the $O(k \log(n/k))$ bound of Equation (2).

We note, however, that our results are not accompanied by efficient recovery algorithms.
Instead, we show the existence of model-based RIP-1 matrices with the given number of rows.
This implies that $Ax$ contains enough information to recover the desired approximation $x^*$
(see Section A for more details).

2. We complement the aforementioned results by showing that the measurement bounds
achievable for a matrix satisfying block-sparse or tree-sparse RIP-1 property cannot be improved
(i.e., our upper bounds are tight). This provides strong evidence that the number of
measurements required for sparse recovery itself cannot be $O(k)$.
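To get a feel for how the three sketch lengths compare, the following minimal sketch evaluates the expressions $k \log(n/k)$, $k \log_k n$ and $k \log(n/k)/\log\log(n/k)$ with all hidden constants set to 1 and logarithms taken base 2. Both choices, as well as the function name `sketch_lengths`, are assumptions made for illustration only; the asymptotic statements above say nothing about the actual constants.

```python
from math import log2

def sketch_lengths(n, k):
    """Return (generic, block, tree) row counts with all constants set to 1.

    generic: k * log(n/k)                    (Equation (2))
    block:   k * log_k(n)                    (block-sparse bound)
    tree:    k * log(n/k) / log log(n/k)     (tree-sparse bound)
    Assumes n/k > 2 so that log log(n/k) is positive.
    """
    ratio = log2(n / k)
    generic = k * ratio
    block = k * log2(n) / log2(k)
    tree = k * ratio / log2(ratio)
    return generic, block, tree

if __name__ == "__main__":
    generic, block, tree = sketch_lengths(2 ** 20, 2 ** 10)
    print(f"generic={generic:.0f} block={block:.0f} tree={tree:.0f}")
```

For $n = 2^{20}$ and $k = 2^{10}$ both model-based expressions are smaller than the generic one, in line with the improvements claimed in item 1.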

Our results show a significant difference between the model-based RIP-1 and RIP-2 matrices.
For the $\ell_2$ norm, the original paper [BCDH10] shows that the number of measurements is fully
determined by the cardinality of the model. Specifically, their proof proceeds by applying the
union bound over all elements of $\mathcal{M}_k$ on top of the Johnson-Lindenstrauss-type concentration
inequality. This leads to a measurement bound of $m = O(k + \log|\mathcal{M}_k|)$, which is $O(k)$ for the
tree-sparse or block-sparse models. In contrast, in the case of the $\ell_1$ norm our lower bounds show that
such a "cardinality-based" argument does not apply, and the number of rows needed to achieve
the RIP-1 property is substantially higher than $O(k)$. For instance, the tree-sparse case with
$k = \omega(\log n)$ gives an almost optimal separation between the number of rows: $O(k)$ for $p = 2$ and
$O(k \log(n/k) / \log\log(n/k))$ for $p = 1$.

Our techniques Our lower bounds are obtained by relating RIP-1 matrices to novel
combinatorial/geometric structures we call generalized expanders. Specifically, it is known [BGI+08] that
any binary 0-1 matrix $A$ that satisfies $(k, \varepsilon)$-RIP-1 is an adjacency matrix of an unbalanced
$(k, \varepsilon)$-expander (see Section 2 for the formal definition). The notion of a generalized expander can be
viewed as extending the notion of expansion to matrices that are not binary. Formally, we define it
as follows.

³ See Question 15: Sparse Recovery for Tree Models. The question was posed by the first author.

Definition 1 (Generalized expander). Let $A$ be an $m \times n$ real matrix. We say that $A$ is a generalized
$(k, \varepsilon)$-expander, if all columns of $A$ have $\ell_1$-norm at most $1 + \varepsilon$, and for every $S \subseteq [n]$ with $|S| \le k$
we have

$$\sum_{i \in [m]} \max_{j \in S} |a_{ij}| \ge |S| \cdot (1 - \varepsilon).$$

Observe that the notion coincides with the standard notion of expansion for binary 0-1 matrices
(after a proper scaling).
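Definition 1 can be checked directly on tiny matrices. The brute-force test below is a minimal sketch: the function name `is_generalized_expander` and the list-of-rows matrix representation are assumptions made here, and the enumeration over all sets $S$ with $|S| \le k$ is exponential, so it is meant only as an executable restatement of the definition.

```python
from itertools import combinations

def is_generalized_expander(A, k, eps):
    """Brute-force check of the generalized (k, eps)-expander conditions.

    A is a list of m rows, each a list of n real numbers (hypothetical layout).
    """
    m, n = len(A), len(A[0])
    # Condition 1: every column has l1-norm at most 1 + eps.
    for j in range(n):
        if sum(abs(A[i][j]) for i in range(m)) > 1 + eps:
            return False
    # Condition 2: for every non-empty S with |S| <= k,
    # sum over rows of the largest |a_ij| with j in S is at least |S| (1 - eps).
    for size in range(1, k + 1):
        for S in combinations(range(n), size):
            mass = sum(max(abs(A[i][j]) for j in S) for i in range(m))
            if mass < size * (1 - eps):
                return False
    return True

if __name__ == "__main__":
    # Adjacency matrix of a left-regular graph with degree 2, scaled by 1/2.
    A = [[0.5, 0.0],
         [0.5, 0.5],
         [0.0, 0.5]]
    print(is_generalized_expander(A, 2, 0.25))
    print(is_generalized_expander(A, 2, 0.2))
```

In the example the two columns share one of their two non-zero rows, so the pair of columns covers mass $1.5$ out of $2$; the check therefore passes for $\varepsilon = 0.25$ and fails for $\varepsilon = 0.2$, matching the usual expansion count for the underlying graph.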

In this paper we show that any (not necessarily binary) RIP-1 matrix is also a generalized
expander. We then use this fact to show that any RIP-1 matrix can be sparsified by replacing most
of its entries by 0. This in turn lets us use counting arguments to lower bound the number of rows
of such a matrix.

Our upper bounds are obtained by constructing low-degree expander-like graphs. However, we
only require that the expansion holds for the sets from the given model $\mathcal{M}_k$. This allows us to
reduce the number of the right nodes of the graph, which corresponds to reducing the number of
rows in its adjacency matrix.

2 Definitions 

In this section we provide the definitions we will use throughout the text. 

Definition 2 (Expander). Let $G = (U, V, E)$ with $|U| = n$, $|V| = m$, $E \subseteq U \times V$ be a bipartite
graph such that all vertices from $U$ have the same degree $d$. Then we say that $G$ is a $(k, \varepsilon)$-expander,
if for every $S \subseteq U$ with $|S| \le k$ we have

$$|\{v \in V \mid \exists u \in S : (u, v) \in E\}| \ge (1 - \varepsilon) d |S|.$$

Definition 3 (Model). Let us call any non-empty subset

$$\mathcal{M}_k \subseteq \Sigma_k = \{A \subseteq [n] \mid |A| = k\}$$

a model.

In particular, $\Sigma_k$ is a model as well.

Definition 4 (Block-sparse model). Suppose that $b, k \in [n]$ and that $b$ divides both $k$ and $n$. Let
us partition our universe $[n]$ into $n/b$ disjoint blocks $B_1, B_2, \ldots, B_{n/b}$ of size $b$. We consider the
following block-sparse model: $\mathcal{B}_{k,b}$ consists of all unions of $k/b$ blocks.

Definition 5 (Tree-sparse model). Suppose that $k \in [n]$ and $n = 2^{h+1} - 1$, where $h$ is a non-negative
integer. Let us identify the elements of $[n]$ with the vertices of a full binary tree of depth $h$. Then the
tree-sparse model $\mathcal{T}_k$ consists of all subtrees of size $k$ that contain the root of the full binary tree.



Definition 6 (Model-sparse vector/set). Let $\mathcal{M}_k \subseteq \Sigma_k$ be any model. We say that a set $S \subseteq [n]$ is
$\mathcal{M}_k$-sparse, if $S$ lies within a set from $\mathcal{M}_k$. Moreover, let us call a vector $x \in \mathbb{R}^n$ $\mathcal{M}_k$-sparse, if its
support is an $\mathcal{M}_k$-sparse set.
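For small parameters the two models and the notion of an $\mathcal{M}_k$-sparse set can be enumerated explicitly. The sketch below makes several assumptions that are not in the text: blocks are consecutive runs of coordinates numbered from 0, tree vertices use heap numbering (root 1, children $2v$ and $2v + 1$), and the names `block_model`, `tree_model` and `is_model_sparse` are hypothetical.

```python
from itertools import combinations

def block_model(n, k, b):
    """All unions of k/b blocks; block i is {i*b, ..., i*b + b - 1} (0-based)."""
    assert n % b == 0 and k % b == 0
    blocks = [frozenset(range(i, i + b)) for i in range(0, n, b)]
    return {frozenset().union(*chosen) for chosen in combinations(blocks, k // b)}

def tree_model(h, k):
    """All size-k subtrees containing the root of a full binary tree of depth h.

    Vertices are numbered 1..2^(h+1)-1 in heap order (an assumption).
    Subtrees are grown one vertex at a time by attaching a missing child.
    """
    n = 2 ** (h + 1) - 1
    level = {frozenset([1])}
    for _ in range(k - 1):
        grown = set()
        for T in level:
            for v in T:
                for child in (2 * v, 2 * v + 1):
                    if child <= n and child not in T:
                        grown.add(T | {child})
        level = grown
    return level

def is_model_sparse(S, model):
    """A set is model-sparse if it lies within some set of the model."""
    S = set(S)
    return any(S <= T for T in model)

if __name__ == "__main__":
    print(len(block_model(8, 4, 2)), len(tree_model(2, 4)))
```

With $n = 8$, $k = 4$, $b = 2$ the block model has $\binom{4}{2} = 6$ elements, far fewer than $\binom{8}{4} = 70$, which is the kind of gap between $|\mathcal{M}_k|$ and $\binom{n}{k}$ that model-based recovery tries to exploit.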

It is straightforward to generalize the notions of RIP-$p$ matrix, expanders and generalized
expanders to the case of $\mathcal{M}_k$-sparse vectors and sets. Let us call the corresponding objects $\varepsilon$-$\mathcal{M}_k$-RIP-$p$
matrix, $\varepsilon$-$\mathcal{M}_k$-expander and generalized $\varepsilon$-$\mathcal{M}_k$-expander, respectively. Clearly, the initial definitions
correspond to the case of $\Sigma_k$-sparse vectors and sets.

Our two main objects of interest are $\mathcal{B}_{k,b}$- and $\mathcal{T}_k$-RIP-1 matrices.



3 Sparsification of RIP-1 matrices 

In this section we show that any $m \times n$ matrix which is $(k, \varepsilon)$-RIP-1 can be sparsified after removing
$(1 - \Omega(1))n$ columns (Theorem 1). Then we state an obvious generalization of this fact (Theorem 2),
which will be useful for proving lower bounds on the number of rows for $\mathcal{B}_{k,b}$- and $\mathcal{T}_k$-RIP-1 matrices.

Theorem 1. Let $A$ be any $m \times n$ matrix which is $(k, \varepsilon)$-RIP-1. Then there exists an $m \times \Omega(n)$
matrix $B$ which is $(k, O(\varepsilon))$-RIP-1, has at most $O(m/k)$ non-zero entries per column and can be
obtained from $A$ by removing some columns and then setting some entries to zero.

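We prove this theorem via a sequence of lemmas. First we prove that for every matrix $A$ there
exists a $\pm 1$-vector $x$ such that $\|Ax\|_1$ is small.

Lemma 1. Let $A$ be any $m \times k$ matrix. Then there exists a vector $x \in \{-1, 1\}^k$ such that

$$\|Ax\|_1 \le \sum_{i \in [m]} \Bigl(\sum_{j \in [k]} a_{ij}^2\Bigr)^{1/2}. \tag{5}$$

Proof. Let us use a probabilistic argument. Namely, let us sample all coordinates $x_j$ independently
and uniformly at random from $\{-1, 1\}$. Then, by Jensen's inequality and independence,

$$\mathbb{E}\,\|Ax\|_1 = \sum_{i \in [m]} \mathbb{E}\,\Bigl|\sum_{j \in [k]} a_{ij} x_j\Bigr| \le \sum_{i \in [m]} \Bigl(\mathbb{E}\Bigl(\sum_{j \in [k]} a_{ij} x_j\Bigr)^2\Bigr)^{1/2} = \sum_{i \in [m]} \Bigl(\sum_{j \in [k]} a_{ij}^2\Bigr)^{1/2}.$$

Thus, there exists a vector $x \in \{-1, 1\}^k$ that satisfies (5). □

As a trivial corollary we have the following statement.

Corollary 1. Let $A$ be any $m \times k$ matrix that preserves (up to $1 \pm \varepsilon$) $\ell_1$-norms of all vectors. Then

$$\sum_{i \in [m]} \Bigl(\sum_{j \in [k]} a_{ij}^2\Bigr)^{1/2} \ge (1 - \varepsilon) k.$$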
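Lemma 1 is easy to test numerically on small matrices. The following minimal sketch compares the best sign vector, found by exhaustive search over $\{-1, 1\}^k$, with the right-hand side of (5); the helper names `min_sign_norm` and `row_l2_sum` are hypothetical, and the exhaustive search is only feasible for small $k$.

```python
from itertools import product
from math import sqrt

def min_sign_norm(A):
    """Minimum of ||Ax||_1 over all x in {-1, 1}^k, by exhaustive search."""
    k = len(A[0])
    return min(
        sum(abs(sum(a * s for a, s in zip(row, signs))) for row in A)
        for signs in product((-1, 1), repeat=k)
    )

def row_l2_sum(A):
    """Right-hand side of (5): the sum of the l2-norms of the rows of A."""
    return sum(sqrt(sum(a * a for a in row)) for row in A)

if __name__ == "__main__":
    A = [[1, 1], [1, -1]]
    print(min_sign_norm(A), row_l2_sum(A))
```

Corollary 1 then follows because every $x \in \{-1, 1\}^k$ has $\|x\|_1 = k$, so a matrix that preserves $\ell_1$-norms up to $1 \pm \varepsilon$ forces the minimum on the left to be at least $(1 - \varepsilon)k$.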



The next lemma shows that every $(k, \varepsilon)$-RIP-1 matrix is a generalized $(k, O(\varepsilon))$-expander. This
is a generalization of a theorem from [BGI+08].

Lemma 2. Let $A$ be any $m \times n$ matrix which is $(k, \varepsilon)$-RIP-1. Then $A$ is a generalized
$(k, 3\varepsilon)$-expander.

Proof. For the proof we need the following lemma.

Lemma 3. For any $y \in \mathbb{R}^k$

$$\|y\|_1 - \|y\|_\infty \le \Bigl(1 + \frac{1}{\sqrt{2}}\Bigr)\bigl(\|y\|_1 - \|y\|_2\bigr). \tag{6}$$

Proof. Clearly, if $y = 0$, then the desired inequality is trivial. Otherwise, by homogeneity we can
assume that $\|y\|_1 = 1$. If $\|y\|_\infty = 1$, then $\|y\|_2 = 1$, and both sides of (6) are equal to zero. So
we can assume that $\|y\|_\infty < 1$. Suppose that $\|y\|_\infty = t$ for some $t \in (0, 1)$. If $1/n > t \ge 1/(n + 1)$
(thus, $n = \lceil 1/t - 1 \rceil$) for some positive integer $n$, then, clearly, $\|y\|_2 \le \sqrt{n t^2 + (1 - nt)^2}$. One can
check using elementary analysis that for every $t \in (0, 1)$

$$\frac{1 - \|y\|_\infty}{1 - \|y\|_2} \le \frac{1 - t}{1 - \sqrt{\lceil 1/t - 1 \rceil t^2 + (1 - \lceil 1/t - 1 \rceil t)^2}} \le 1 + \frac{1}{\sqrt{2}}$$

(equality is attained at $t = 1/2$). This concludes the proof. □
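Inequality (6) can be sanity-checked numerically. The helper below is a minimal sketch (the name `lemma3_gap` is hypothetical): it returns the right-hand side of (6) minus the left-hand side, which the lemma asserts is non-negative for every $y$, with equality for $y = (1, 1)$, the case $t = 1/2$ after normalization.

```python
from math import sqrt

def lemma3_gap(y):
    """Right-hand side of (6) minus its left-hand side for the vector y."""
    l1 = sum(abs(v) for v in y)
    l2 = sqrt(sum(v * v for v in y))
    linf = max(abs(v) for v in y)
    return (1 + 1 / sqrt(2)) * (l1 - l2) - (l1 - linf)

if __name__ == "__main__":
    print(lemma3_gap([1.0, 1.0]))        # the tight case, up to rounding
    print(lemma3_gap([1.0, 0.5, 0.25]))  # a strictly positive gap
```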

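Let $S \subseteq [n]$ be any subset of size at most $k$. For any $i \in [m]$ let us denote $y_i = (a_{ij})_{j \in S} \in \mathbb{R}^S$.
We have

$$\begin{aligned}
\sum_{i \in [m]} \|y_i\|_\infty &\ge \Bigl(1 + \frac{1}{\sqrt{2}}\Bigr) \sum_{i \in [m]} \|y_i\|_2 - \frac{1}{\sqrt{2}} \sum_{i \in [m]} \|y_i\|_1 && \text{(by Lemma 3)} \\
&\ge \Bigl(1 + \frac{1}{\sqrt{2}}\Bigr)(1 - \varepsilon)|S| - \frac{1}{\sqrt{2}}(1 + \varepsilon)|S| && \text{(by Corollary 1 and RIP-1)} \\
&= \bigl(1 - (1 + \sqrt{2})\varepsilon\bigr)|S|.
\end{aligned}$$

So, $A$ is a generalized $(k, (1 + \sqrt{2})\varepsilon)$-expander. Since $1 + \sqrt{2} < 3$, this concludes the proof.

□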

Finally, we prove Theorem 1.

Proof of Theorem 1. By Lemma 2, $A$ is a generalized $(k, 3\varepsilon)$-expander. Let us partition $[n]$ into $n/k$
disjoint sets of size $k$ arbitrarily: $[n] = S_1 \cup S_2 \cup \ldots \cup S_{n/k}$. Now for every $i \in [m]$ and every $S_t$
let us zero out all the entries $a_{ij}$ for $j \in S_t$ except one with the largest absolute value. Let $A'$ be
the resulting matrix. Since $A$ is a generalized $(k, 3\varepsilon)$-expander, we know that the (vector) $\ell_1$ norm
of the difference $A - A'$ is at most $3\varepsilon n$. Thus, each column of $A - A'$ has $\ell_1$ norm at most
$3\varepsilon$ on average. The number of non-zero entries in $A'$ is at most $mn/k$, so a column has at most
$m/k$ non-zero entries on average. Thus, by Markov's inequality there is a set of $n/3$ columns such
that we have moved each of them by at most $9\varepsilon$ and each of them contains at most $3m/k$ non-zero
entries. We define a matrix $B$ that consists of these columns. Since we have modified each of these
columns by at most $9\varepsilon$ and $A$ is $(k, \varepsilon)$-RIP-1, we have that $B$ is $(k, 10\varepsilon)$-RIP-1. □

The following theorem is a straightforward generalization of Theorem 1. It can be proved via
literally the same argument.

Theorem 2. Suppose that a model $\mathcal{M}_k \subseteq \Sigma_k$ has the following properties:

• for some $l \le k$ all sets from $\Sigma_l$ are $\mathcal{M}_k$-sparse;

• there exists a partition of an $\Omega(1)$-fraction of $[n]$ into disjoint subsets of size $\Omega(k)$ such that
each of these subsets is $\mathcal{M}_k$-sparse.

Then if $A$ is an $m \times n$ matrix which is $\varepsilon$-$\mathcal{M}_k$-RIP-1 for some sufficiently small $\varepsilon > 0$, there exists
an $m \times \Omega(n)$ matrix $B$ which is $(l, O(\varepsilon))$-RIP-1, has at most $O(m/k)$ non-zero entries per column
and can be obtained from $A$ by removing some columns and then setting some entries to zero.
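The sparsification step used in the proofs of Theorems 1 and 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure verbatim: columns are partitioned into consecutive groups of size `k`, the thresholds `max_move` and `max_nnz` (playing the roles of $9\varepsilon$ and $3m/k$) are passed in by the caller, and the function name `sparsify` is hypothetical.

```python
def sparsify(A, k, max_move, max_nnz):
    """Keep, per row and per column group, only the largest-magnitude entry,
    then retain the columns that changed little and ended up sparse.

    A: list of m rows with n entries each.
    Returns (kept_column_indices, B) where B has the kept columns of the
    zeroed-out matrix.
    """
    m, n = len(A), len(A[0])
    A2 = [[0.0] * n for _ in range(m)]
    for start in range(0, n, k):
        group = range(start, min(start + k, n))
        for i in range(m):
            j_best = max(group, key=lambda j: abs(A[i][j]))
            A2[i][j_best] = A[i][j_best]
    keep = []
    for j in range(n):
        move = sum(abs(A[i][j] - A2[i][j]) for i in range(m))  # l1 change of column j
        nnz = sum(1 for i in range(m) if A2[i][j] != 0)
        if move <= max_move and nnz <= max_nnz:
            keep.append(j)
    B = [[A2[i][j] for j in keep] for i in range(m)]
    return keep, B

if __name__ == "__main__":
    A = [[1.0, 0.1, 0.0, 0.0],
         [0.0, 1.0, 0.2, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    print(sparsify(A, 2, 0.05, 1))
```

The Markov argument in the proof guarantees that, with the thresholds set to three times the respective averages, at least a third of the columns survive; the code itself only applies the thresholds and does not verify that guarantee.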

4 Lower bounds for model-based RIP-1 matrices 

In this section we prove lower bounds on the number of rows for $\mathcal{B}_{k,b}$- and $\mathcal{T}_k$-RIP-1 matrices.
This is done using the following general theorem.

Theorem 3. If a model $\mathcal{M}_k \subseteq \Sigma_k$ satisfies the conditions of Theorem 2 and $A$ is an $m \times n$ matrix
which is $\varepsilon$-$\mathcal{M}_k$-RIP-1 for some sufficiently small $\varepsilon > 0$, then

$$m = \Omega\Bigl(k \cdot \frac{\log(n/k)}{\log(k/l)}\Bigr).$$

The proof is a combination of Theorem 2 and a counting argument similar to one used in [Nac10].
First, we need the following standard geometric fact.
First, we need the following standard geometric fact. 

Theorem 4. Let $v_1, v_2, \ldots, v_n \in \mathbb{R}^d$ be a set of $d$-dimensional vectors such that

• for every $i \in [n]$ we have $\|v_i\|_1 \le 1.1$;

• for every $i \ne j \in [n]$ we have $\|v_i - v_j\|_1 \ge 0.9$.

Then $n \le 4^d$.

Proof. Denote by $B(x, r)$ the ball in the $\ell_1$-metric with center $x$ and radius $r$. Consider the balls
$B_i = B(v_i, 0.45)$. On the one hand, these balls are disjoint; on the other hand, they lie within $B(0, 1.55)$.
Thus, comparing volumes, we see that

$$n \le \Bigl(\frac{1.55}{0.45}\Bigr)^d \le 4^d.$$

□

The next theorem shows a tradeoff between $m$ and column sparsity for any RIP-1 matrix. A
variant of it was proved in [Nac10], but we present the proof here for the sake of completeness.

Theorem 5 ([Nac10]). Let $A$ be an $m \times n$ matrix which is $(k, \varepsilon)$-RIP-1 for some sufficiently small
$\varepsilon > 0$. Moreover, suppose that every column of $A$ has at most $s$ non-zero entries. Then

$$s \log\Bigl(\frac{m}{sk}\Bigr) = \Omega\Bigl(\log\Bigl(\frac{n}{k}\Bigr)\Bigr).$$

Proof. We need a lemma from [Nac10], which is proved by a standard probabilistic argument.
Lemma 4. There exists a set $X \subseteq \mathbb{R}^n$ of $k/2$-sparse vectors such that

• $\log|X| = \Omega(k \log(n/k))$;

• every vector from $X$ has unit $\ell_1$-norm;

• all pairwise $\ell_1$-distances between the elements of $X$ are at least 1.

Now let us see how $A$ acts on the elements of $X$. Clearly, for every $x \in X$ the vector $Ax$ is
$sk$-sparse. By the pigeonhole principle, for some $S \subseteq [m]$ with $|S| \le sk$ there exists a
subset $X' \subseteq X$ with

$$|X'| \ge \frac{|X|}{\binom{m}{sk}} \tag{7}$$

such that for every $x \in X'$ the support of $Ax$ lies within $S$.

On the other hand, since $A$ is $(k, \varepsilon)$-RIP-1, one can easily see that the set $\{Ax\}_{x \in X'}$ (which lies
in an $sk$-dimensional subspace) has the following properties:

• every vector from the set has $\ell_1$-norm at most $1 + \varepsilon$;

• all pairwise distances are at least $1 - \varepsilon$.

Since this set lies in an $sk$-dimensional subspace, by Theorem 4 its cardinality is bounded by $4^{sk}$
(provided that $\varepsilon$ is sufficiently small). Thus, by plugging this bound into (7) we have

$$\frac{2^{\Omega(k \log(n/k))}}{\binom{m}{sk}} \le 4^{sk}.$$

Now by using the standard estimate $\binom{m}{sk} \le 2^{O(sk \log(m/sk))}$ we have the desired statement. □

Now we can finish the proof of Theorem 3.

Proof of Theorem 3. By Theorem 2 we can get an $m \times \Omega(n)$ matrix $B$ with column sparsity
$s = O(m/k)$ which is $(l, O(\varepsilon))$-RIP-1. Then applying Theorem 5 we have $s \log(m/(sl)) = \Omega(\log(n/l))$.
Since $s = O(m/k)$, we get

$$m = \Omega\Bigl(k \cdot \frac{\log(n/l)}{\log(k/l)}\Bigr),$$

which implies the desired bound because $l \le k$.

□

Next we apply Theorem 3 to $\mathcal{B}_{k,b}$- and $\mathcal{T}_k$-RIP-1 matrices.

Theorem 6. For any $k \ge 2b$ and sufficiently small $\varepsilon > 0$, if $A$ is an $m \times n$ matrix which is
$\varepsilon$-$\mathcal{B}_{k,b}$-RIP-1, then $m = \Omega(k \log_k n)$.

Proof. Clearly, if $k \ge 2b$, then $\mathcal{B}_{k,b}$ satisfies the conditions of Theorem 2 for $l = 2$. Thus, by
Theorem 3 we have

$$m = \Omega\Bigl(k \cdot \frac{\log(n/k)}{\log k}\Bigr).$$

□



Theorem 7. Let $A$ be an $m \times n$ matrix which is $\varepsilon$-$\mathcal{T}_k$-RIP-1. Then, if $\varepsilon$ is sufficiently small and
$k = \omega(\log n)$,

$$m = \Omega\Bigl(k \cdot \frac{\log(n/k)}{\log\log(n/k)}\Bigr).$$

Proof. The next lemma shows that for any $k = \omega(\log n)$ the model $\mathcal{T}_k$ satisfies the first condition
of Theorem 2 with $l = \Omega(k/\log(n/k))$.

Lemma 5. Let $S \subseteq [n]$ be a subset of the full binary tree. Then there exists a subtree that contains
both $S$ and the root with at most $O(|S| \log(n/|S|))$ vertices.

Proof. Let $T$ be a subtree that consists of the $\log|S|$ levels of the full binary tree that are closest to the
root. Let $T'$ be a subtree that is the union of $T$ and the paths from the root to all the elements of $S$. It
is not hard to see that $|T' \setminus T| \le |S| \log(n/|S|)$. As a result we get

$$|T'| \le |T| + |S| \log(n/|S|) \le O(|S| \log(n/|S|)).$$

□
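The counting in Lemma 5 can be checked on small trees. The sketch below assumes heap numbering of the full binary tree (root 1, parent of $v$ is $\lfloor v/2 \rfloor$), which the text does not prescribe, and uses hypothetical names `root_closure` and `closure_bound`. It computes the smallest rooted subtree containing $S$, namely the union of the root paths, and compares its size with the explicit bound $2|S| + |S| \log_2((n + 1)/|S|)$ obtained by making the constants in the proof concrete.

```python
from math import log2

def root_closure(S):
    """Union of the paths from the root to the vertices of S (heap numbering)."""
    closure = set()
    for v in S:
        # Walk up until we reach a vertex that is already covered.
        while v >= 1 and v not in closure:
            closure.add(v)
            v //= 2
    return closure

def closure_bound(n, s):
    """Explicit form of the O(|S| log(n/|S|)) bound: the top ceil(log2 s) levels
    hold fewer than 2s vertices, and each path adds at most log2((n+1)/s) more."""
    return 2 * s + s * log2((n + 1) / s)

if __name__ == "__main__":
    S = {16, 24, 31}  # three leaves of the depth-4 tree on 31 vertices
    print(len(root_closure(S)), closure_bound(31, len(S)))
```

Since the union of root paths is contained in the subtree $T'$ from the proof, its size is at most $|T'|$, so the bound applies to it as well.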

The second condition of Theorem 2 is satisfied as well (here we use that $k = \omega(\log n)$). Thus,
applying Theorem 3 we have

$$m = \Omega\Bigl(k \cdot \frac{\log(n/k)}{\log\log(n/k)}\Bigr).$$

□

5 Upper bounds for model-based RIP-1 matrices 

In this section we complement the lower bounds by upper bounds. 

We use the following obvious modification of a theorem from [BGI+08].

Theorem 8 ([BGI+08]). If a graph $G = (U, V, E)$ is an $\varepsilon$-$\mathcal{M}_k$-expander for some model $\mathcal{M}_k \subseteq \Sigma_k$,
then the normalized (by a factor of $d$, where $d$ is the degree of all vertices from $U$) adjacency matrix
of $G$ (whose size is $|V| \times |U|$) is an $O(\varepsilon)$-$\mathcal{M}_k$-RIP-1 matrix.

Thus, it is sufficient to build $\mathcal{B}_{k,b}$- and $\mathcal{T}_k$-expanders with as small $m$ as possible. We use the
standard probabilistic argument to show the existence of such graphs. Namely, for every vertex
$u \in U$ we sample a subset of $[m]$ of size $d$ ($d$ has to be carefully chosen). Then, we connect $u$ and
all the vertices from this subset. All sets we sample are uniform (among all $d$-subsets of $[m]$) and
independent.
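The random construction just described can be sketched in a few lines. The code below is an illustration only: the function names, the 0-based vertex numbering and the use of Python's `random.Random` with a fixed seed are assumptions, and no attempt is made to choose $d$ and $m$ as the analysis requires.

```python
import random

def random_left_regular_graph(n, m, d, seed=0):
    """For each of the n left vertices, sample a uniform d-subset of the m right vertices."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(m), d)) for _ in range(n)]

def expansion_ratio(neighbors, S, d):
    """|N(S)| / (d |S|) for a non-empty set S of left vertices."""
    covered = set()
    for u in S:
        covered.update(neighbors[u])
    return len(covered) / (d * len(S))

def normalized_adjacency(neighbors, m, d):
    """The |V| x |U| adjacency matrix scaled by 1/d, as in Theorem 8."""
    A = [[0.0] * len(neighbors) for _ in range(m)]
    for u, nbrs in enumerate(neighbors):
        for v in nbrs:
            A[v][u] = 1.0 / d
    return A

if __name__ == "__main__":
    g = random_left_regular_graph(n=12, m=20, d=3, seed=1)
    print(expansion_ratio(g, (0, 1, 2), 3))
```

A graph is an $\varepsilon$-$\mathcal{M}_k$-expander exactly when `expansion_ratio` is at least $1 - \varepsilon$ on every non-empty $\mathcal{M}_k$-sparse set, so checking the model-based property amounts to looping this function over the sets of the model rather than over all small sets.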

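We use the following tail inequality, which can be proved using the Chernoff bound (and whose
slight variant is proved and used in [BMRV02]).

Lemma 6 ([BMRV02]). There exist constants $C > 1$ and $\delta > 0$ such that, whenever $m \ge Cdt/\varepsilon$,
one has for any $T \subseteq U$ with $|T| = t$

$$\Pr\bigl[\,|\{v \in V \mid \exists u \in T : (u, v) \in E\}| < (1 - \varepsilon) d t\,\bigr] \le \Bigl(\delta \cdot \frac{\varepsilon m}{dt}\Bigr)^{-\varepsilon d t}.$$

For the proof see Appendix (Section B).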

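For a model $\mathcal{M}_k \subseteq \Sigma_k$ denote by $\#(\mathcal{M}_k, t)$ the number of $\mathcal{M}_k$-sparse sets of size $t$ (for $t \in [k]$). We
use the following simple estimates.

Lemma 7.

$$\forall t \in [k] \quad \#(\mathcal{B}_{k,b}, t) \le \min\Bigl\{\binom{n/b}{k/b}\binom{k}{t}, \binom{n}{t}\Bigr\},$$

$$\forall t \in [k] \quad \#(\mathcal{T}_k, t) \le \min\Bigl\{4^k \binom{k}{t}, \binom{n}{t}\Bigr\}.$$

Proof. The only non-trivial bound is the first bound for $\mathcal{T}_k$, which follows from the explicit formula
for Catalan numbers. □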
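The Catalan-number bound behind Lemma 7 can be verified for small parameters. The sketch below counts rooted subtrees of a full binary tree with a simple recursion over the sizes of the left and right parts and compares the count with the $k$-th Catalan number; the function names and the convention that `levels` equals $h + 1$ are assumptions made here.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def rooted_subtrees(levels, size):
    """Number of connected vertex sets of the given size that contain the root of a
    full binary tree with `levels` levels; size 0 counts the empty choice once."""
    if size == 0:
        return 1
    if levels == 0:
        return 0
    # The root is taken; split the remaining size - 1 vertices between the two children.
    return sum(
        rooted_subtrees(levels - 1, a) * rooted_subtrees(levels - 1, size - 1 - a)
        for a in range(size)
    )

def catalan(k):
    """The k-th Catalan number, via its explicit formula."""
    return comb(2 * k, k) // (k + 1)

if __name__ == "__main__":
    # |T_k| for depth h = 3 (n = 15) and k = 4, next to the Catalan number.
    print(rooted_subtrees(4, 4), catalan(4), 4 ** 4)
```

When the tree is deep enough ($h \ge k - 1$) the count equals the Catalan number exactly; for shallower trees it is smaller. In both cases $|\mathcal{T}_k| \le 4^k$, and every $\mathcal{T}_k$-sparse set of size $t$ is a $t$-subset of some element of $\mathcal{T}_k$, which gives the factor $\binom{k}{t}$.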

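Now if we combine this lemma with the standard estimate $\binom{u}{v} \le (eu/v)^v$, we get the following
bounds.

$$\forall t \in [k] \quad \#(\mathcal{B}_{k,b}, t) \le \min\Bigl\{\Bigl(\frac{en}{k}\Bigr)^{k/b} \Bigl(\frac{ek}{t}\Bigr)^{t}, \Bigl(\frac{en}{t}\Bigr)^{t}\Bigr\} \tag{8}$$

$$\forall t \in [k] \quad \#(\mathcal{T}_k, t) \le \min\Bigl\{4^k \Bigl(\frac{ek}{t}\Bigr)^{t}, \Bigl(\frac{en}{t}\Bigr)^{t}\Bigr\} \tag{9}$$

Now we combine these estimates, Lemma 6, Theorem 8 and the union bound to get upper bounds
on $m$ for $\mathcal{B}_{k,b}$- and $\mathcal{T}_k$-RIP-1 matrices.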

Theorem 9. For any $\varepsilon > 0$ and $b = \omega(\log n)$ there exists an $\varepsilon$-$\mathcal{B}_{k,b}$-RIP-1 matrix with

$$m = O\Bigl(\frac{k}{\varepsilon^2} \log_k n\Bigr).$$

Theorem 10. For any $\varepsilon > 0$ and $k = \omega(\log n)$ there exists an $\varepsilon$-$\mathcal{T}_k$-RIP-1 matrix with

$$m = O\Bigl(\frac{k}{\varepsilon^2} \cdot \frac{\log(n/k)}{\log\log(n/k)}\Bigr).$$

The proofs are in the appendix. We note that the requirement that $b$ is not too small is necessary.
E.g., if we had $b = 1$, then the property is equivalent to the standard $(k, \varepsilon)$-RIP-1 property, for which
the upper bound of $O(k \log_k n)$ cannot be achieved.

We also would like to point out that the estimates on $\#(\mathcal{T}_k, t)$ that we are using to prove
Theorem 10 are true for any model with cardinality $\exp(k)$. Thus, for any such model there exist
RIP-1 matrices with $O(k \log(n/k) / \log\log(n/k))$ rows.

A RIP-1 yields sparse recovery

In this section we show improved upper bounds on the number of measurements needed to recover
good block- or tree-sparse approximations with the $\ell_1/\ell_1$ guarantee and a constant approximation factor.
This result is folklore, but we include it for completeness.

Suppose that $\mathcal{M}_k \subseteq \Sigma_k$ is some model. We say that an $m \times n$ matrix $A$ is $\varepsilon$-$\mathcal{M}_k^{(2)}$-RIP-1, if for
every $x \in \mathbb{R}^n$ such that $\operatorname{supp} x \subseteq S_1 \cup S_2$ for some $\mathcal{M}_k$-sparse sets $S_1$ and $S_2$ one has

$$(1 - \varepsilon)\|x\|_1 \le \|Ax\|_1 \le (1 + \varepsilon)\|x\|_1.$$

Let $A$ be any $\varepsilon$-$\mathcal{M}_k^{(2)}$-RIP-1 matrix for a sufficiently small $\varepsilon$ such that $\|A\|_1 \le 1 + \varepsilon$. Algorithm 1
(whose running time is exponential in $n$), given $y = Ax$ for some $x \in \mathbb{R}^n$, recovers a vector $x^* \in \mathbb{R}^n$
such that

$$\|x - x^*\|_1 \le (3 + O(\varepsilon)) \cdot \min_{x' \text{ is } \mathcal{M}_k\text{-sparse}} \|x - x'\|_1. \tag{10}$$

Note that the optimization problem within the for-loop can be easily reduced to a linear program.

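Algorithm 1 Model-based sparse recovery

Input: $y = Ax$ for some $x \in \mathbb{R}^n$
Output: a good $\mathcal{M}_k$-approximation $x^*$ of $x$

$x^* \leftarrow 0$
for each $\mathcal{M}_k$-sparse set $S \subseteq [n]$ do
    $\tilde{x} \leftarrow \operatorname{argmin}_{\operatorname{supp} x' \subseteq S} \|y - Ax'\|_1$
    if $\|y - A\tilde{x}\|_1 < \|y - Ax^*\|_1$ then
        $x^* \leftarrow \tilde{x}$
    end if
end for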
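Algorithm 1 can be prototyped for toy instances. The sketch below departs from the text in several labeled ways: instead of a linear program it solves each inner $\ell_1$ problem by vertex enumeration, using the standard fact that when the columns of $A$ indexed by $S$ are linearly independent, some minimizer makes $|S|$ residuals vanish on rows whose square submatrix is invertible; it works in exact rational arithmetic; it loops only over the sets of the model itself, which suffices because every $\mathcal{M}_k$-sparse set lies within one of them; and all function names are hypothetical. It is exponential in every parameter and is meant only to make the control flow concrete.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Gauss-Jordan elimination over Fractions; returns None if M is singular."""
    size = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [v * inv for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][size] for r in range(size)]

def l1_residual(A, y, x):
    """||y - Ax||_1."""
    return sum(abs(yi - sum(a * xj for a, xj in zip(row, x))) for row, yi in zip(A, y))

def l1_fit_on_support(A, y, S):
    """Minimize ||y - Ax'||_1 over x' supported in S (assumes independent columns on S)."""
    m, n = len(A), len(A[0])
    S = sorted(S)
    best = [Fraction(0)] * n
    best_val = l1_residual(A, y, best)
    for rows in combinations(range(m), len(S)):
        z = solve_square([[A[i][j] for j in S] for i in rows], [y[i] for i in rows])
        if z is None:
            continue
        x = [Fraction(0)] * n
        for j, zj in zip(S, z):
            x[j] = zj
        val = l1_residual(A, y, x)
        if val < best_val:
            best, best_val = x, val
    return best, best_val

def model_based_recovery(A, y, model):
    """The loop of Algorithm 1 over the sets of the model."""
    x_star = [Fraction(0)] * len(A[0])
    best_val = l1_residual(A, y, x_star)
    for S in model:
        x_tilde, val = l1_fit_on_support(A, y, S)
        if val < best_val:
            x_star, best_val = x_tilde, val
    return x_star

if __name__ == "__main__":
    # Toy instance (an assumption): a 4 x 6 Vandermonde matrix, blocks of size 2, k = 2.
    A = [[(j + 1) ** i for j in range(6)] for i in range(4)]
    x = [0, 0, 3, -2, 0, 0]
    y = [sum(a * v for a, v in zip(row, x)) for row in A]
    print(model_based_recovery(A, y, [{0, 1}, {2, 3}, {4, 5}]))
```

In the toy instance any four columns of the Vandermonde matrix are linearly independent, so a block-sparse signal is the only vector supported on a single block with zero residual, and the loop returns it exactly. The Vandermonde matrix is chosen only for this injectivity; it is not claimed to satisfy the RIP-1 conditions of this section.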

Now let us prove that the resulting vector $x^*$ satisfies (10). Denote by $x_{\mathcal{M}_k}$ an $\mathcal{M}_k$-sparse vector that
minimizes $\|x - x_{\mathcal{M}_k}\|_1$. Now we have

$$\begin{aligned}
\|x - x^*\|_1 &\le \|x - x_{\mathcal{M}_k}\|_1 + \|x_{\mathcal{M}_k} - x^*\|_1 \\
&\le \|x - x_{\mathcal{M}_k}\|_1 + (1 + O(\varepsilon))\|A(x_{\mathcal{M}_k} - x^*)\|_1 \\
&\le \|x - x_{\mathcal{M}_k}\|_1 + (1 + O(\varepsilon))\bigl(\|A(x - x_{\mathcal{M}_k})\|_1 + \|A(x - x^*)\|_1\bigr) \\
&\le (2 + O(\varepsilon))\|x - x_{\mathcal{M}_k}\|_1 + (1 + O(\varepsilon))\|A(x - x^*)\|_1 \\
&\le (2 + O(\varepsilon))\|x - x_{\mathcal{M}_k}\|_1 + (1 + O(\varepsilon))\|A(x - x_{\mathcal{M}_k})\|_1 \\
&\le (3 + O(\varepsilon))\|x - x_{\mathcal{M}_k}\|_1.
\end{aligned}$$

The second inequality is true, since $A$ is $\varepsilon$-$\mathcal{M}_k^{(2)}$-RIP-1 and both $x_{\mathcal{M}_k}$ and $x^*$ are $\mathcal{M}_k$-sparse. The
fourth inequality is true, because $\|A\|_1 \le 1 + \varepsilon$. The fifth inequality is due to the construction of
the algorithm: clearly, $\|A(x - x^*)\|_1 \le \|A(x - x_{\mathcal{M}_k})\|_1$.

It is immediate to see that any $\varepsilon$-$\mathcal{B}_{2k,b}$-RIP-1 matrix is $\varepsilon$-$\mathcal{B}_{k,b}^{(2)}$-RIP-1. Similarly, any $\varepsilon$-$\mathcal{T}_{2k}$-RIP-1
matrix is $\varepsilon$-$\mathcal{T}_k^{(2)}$-RIP-1. Moreover, since all the singletons are both block- and tree-sparse, we have
that these matrices have $\ell_1$-norm at most $1 + \varepsilon$. Thus, plugging in Theorems 9 and 10 we get the
following result.

Theorem 11. The problem of model-based stable sparse recovery with the $\ell_1/\ell_1$ guarantee and a constant
approximation factor can be solved

• with

$$m = O(k \log_k n)$$

measurements for $\mathcal{B}_{k,b}$, provided that $b = \omega(\log n)$;

• with

$$m = O\Bigl(k \cdot \frac{\log(n/k)}{\log\log(n/k)}\Bigr)$$

measurements for $\mathcal{T}_k$, provided that $k = \omega(\log n)$.

B Proof of Lemma 6

To prove the lemma we need the following version of the Chernoff bound [MR95].

Theorem 12 ([MR95]). Suppose that $X_1, X_2, \ldots, X_n$ are independent binary random variables.
Denote $\mu = \mathbb{E}[X_1 + X_2 + \ldots + X_n]$. Then for any $\tau > 0$

$$\Pr[X_1 + X_2 + \ldots + X_n \ge (1 + \tau)\mu] \le \Bigl(\frac{e^{\tau}}{(1 + \tau)^{(1 + \tau)}}\Bigr)^{\mu}.$$

We can enumerate all $dt$ edges outgoing from $T$ arbitrarily. Then let us denote by $C_u$ the event
"the $u$-th of these edges collides with the $v$-th edge on the right side, where $v < u$".

We would like to upper bound the probability of the event "at least $\varepsilon dt$ of the events $C_u$ happen".
As shown in [BMRV02], this probability is at most

$$\Pr\Bigl[B\Bigl(dt, \frac{dt}{m}\Bigr) \ge \varepsilon dt\Bigr],$$

where $B(N, p)$ denotes a binomial random variable with $N$ trials and success probability $p$.
Thus we can apply Theorem 12 with $\tau = \varepsilon m/(dt) - 1$ and $\mu = (dt)^2/m$. Then, the statement of the
lemma can be verified routinely.

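C Proof of Theorem 9

If we combine Lemma 6 and the union bound we see that it is sufficient to prove that

$$\forall t \in [k] \quad \#(\mathcal{B}_{k,b}, t) \cdot \Bigl(\delta \cdot \frac{\varepsilon m}{dt}\Bigr)^{-\varepsilon d t} \le \frac{1}{n},$$

provided that $m \ge Cdk/\varepsilon$ (here $C$ and $\delta$ are the constants from the statement of Lemma 6). Now
let us plug in (8). Namely, for $t \in [1, k/b]$ we use the second estimate, for $t \in [k/b, k]$ we use the first
estimate. Thus, it is left to prove that

$$\forall t \in [1, k/b] \quad \Bigl(\frac{en}{t}\Bigr)^{t} \Bigl(\delta \cdot \frac{\varepsilon m}{dt}\Bigr)^{-\varepsilon d t} \le \frac{1}{n} \tag{11}$$

$$\forall t \in [k/b, k] \quad \Bigl(\frac{en}{k}\Bigr)^{k/b} \Bigl(\frac{ek}{t}\Bigr)^{t} \Bigl(\delta \cdot \frac{\varepsilon m}{dt}\Bigr)^{-\varepsilon d t} \le \frac{1}{n} \tag{12}$$

It is straightforward to check that, whenever $d \ge 1/\varepsilon$, the left-hand sides of (11) and (12) are
log-convex. Thus, to check (11) and (12), it is sufficient to check them for $t = 1, k/b, k$.

Let us set $m = C'dk/\varepsilon$ for a constant $C'$, which is much larger than $C$, and $d = C'' \log_k n/\varepsilon$ for
some large constant $C''$.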

Let us first check (11) for $t = 1$. By substituting the values for $m$ and $d$ from the previous
paragraph we have

$$en (\delta C' k)^{-C'' \log_k n} \le \frac{1}{n}$$

for sufficiently large $C'$ and $C''$. Now, let us check (11) for $t = k/b$. We want to prove that

$$\Bigl(\frac{enb}{k}\Bigr)^{k/b} (\delta C' b)^{-C'' k \log_k n / b} \le \frac{1}{n}$$

for sufficiently large $C'$ and $C''$. First, note that $(enb/k)^{k/b} = \Omega(n)$, so it is sufficient to prove that

$$\log_k n \cdot \log b = \Omega(\log(nb/k)).$$

Or, equivalently,

$$\log n \cdot \log b = \Omega(\log k \cdot \log(nb/k)).$$

This inequality is immediate to check.

One can check (12) for $t = k/b$ similarly.

Finally, let us check (12) for $t = k$. We need to ensure that

$$\Bigl(\frac{en}{k}\Bigr)^{k/b} e^{k} (\delta C')^{-C'' k \log_k n} \le \frac{1}{n}.$$

For this it is sufficient to have

$$b = \Omega\Bigl(\frac{\log(n/k)}{\log_k n}\Bigr).$$

Since $b = \omega(\log n)$, this is also true.



D Proof of Theorem 10

This theorem is proved similarly to the previous one. We use (9) as follows: for $1 \le t \le 2k/\log(n/k)$
we use the second estimate, for $2k/\log(n/k) \le t \le k$ we use the first estimate. Again, it is sufficient
to check that

$$\forall t \in [1, 2k/\log(n/k)] \quad \Bigl(\frac{en}{t}\Bigr)^{t} \Bigl(\delta \cdot \frac{\varepsilon m}{dt}\Bigr)^{-\varepsilon d t} \le \frac{1}{n} \tag{13}$$

$$\forall t \in [2k/\log(n/k), k] \quad 4^k \Bigl(\frac{ek}{t}\Bigr)^{t} \Bigl(\delta \cdot \frac{\varepsilon m}{dt}\Bigr)^{-\varepsilon d t} \le \frac{1}{n} \tag{14}$$

Once again one can show that, whenever $\varepsilon d \ge 1$, the left-hand sides of (13) and (14) are log-convex.
Thus, it is sufficient to check these conditions for $t = 1, 2k/\log(n/k), k$. We set $m = C'dk/\varepsilon$ and
$d = C'' \log(n/k) / (\varepsilon \log\log(n/k))$ for sufficiently large constants $C'$ and $C''$.
Let us check (13) for $t = 1$. To upper bound

$$en (\delta C' k)^{-C'' \log(n/k) / \log\log(n/k)} \tag{15}$$

note that since $k = \omega(\log n)$ one has $\log(n/k)/\log\log(n/k) = \Omega(\log_k n)$, so (15) is much smaller
than $1/n$.

Let us now check (13) for $t = 2k/\log(n/k)$. We need to upper bound

$$\Bigl(\frac{en \log(n/k)}{2k}\Bigr)^{2k/\log(n/k)} \Bigl(\frac{\delta C' \log(n/k)}{2}\Bigr)^{-2C'' k / \log\log(n/k)}.$$

Since $k = \omega(\log n)$ one has

$$\Bigl(\frac{en \log(n/k)}{2k}\Bigr)^{2k/\log(n/k)} = \Omega(n),$$

so it is sufficient to ensure that

$$\frac{\log(n/k)}{\log\log(n/k)} = \Omega\Bigl(\frac{\log(n \log(n/k)/k)}{\log\log(n/k)}\Bigr),$$

but this inequality is immediate.

Checking (14) for $t = 2k/\log(n/k)$ can be reduced to the previous case, since the left-hand sides
of (13) and (14) are equal in this case.

Thus, it is left to check (14) for $t = k$. We should upper bound

$$(4e)^{k} (\delta C')^{-C'' k \log(n/k) / \log\log(n/k)}.$$

But clearly, since $k = \omega(\log n)$, this quantity is much less than $1/n$ (for sufficiently large $C'$ and
$C''$).